61 research outputs found
Counter Attack on Byzantine Generals: Parameterized Model Checking of Fault-tolerant Distributed Algorithms
We introduce an automated parameterized verification method for
fault-tolerant distributed algorithms (FTDA). FTDAs are parameterized by both
the number of processes and the assumed maximum number of Byzantine faulty
processes. At the center of our technique is a parametric interval abstraction
(PIA) where the interval boundaries are arithmetic expressions over parameters.
Using PIA for both data abstraction and a new form of counter abstraction, we
reduce the parameterized problem to finite-state model checking. We demonstrate
the practical feasibility of our method by verifying several variants of the
well-known distributed algorithm by Srikanth and Toueg. Our semi-decision
procedures are complemented and motivated by an undecidability proof for FTDA
verification which holds even in the absence of interprocess communication. To
the best of our knowledge, this is the first paper to achieve parameterized
automated verification of Byzantine FTDA
Guard Automata for the Verification of Safety and Liveness of Distributed Algorithms
Distributed algorithms typically run over arbitrary many processes and may involve unboundedly many rounds, making the automated verification of their correctness challenging. Building on domain theory, we introduce a framework that abstracts infinite-state distributed systems that represent distributed algorithms into finite-state guard automata. The soundness of the approach corresponds to the Scott-continuity of the abstraction, which relies on the assumption that the distributed algorithms are layered. Guard automata thus enable the verification of safety and liveness properties of distributed algorithms
Network Synchronization in the Crash-Recovery Model
This work investigates the amount of information about failures required to simulate a synchronous distributed system by an asynchronous distributed system prone to crash-recovery failures. A failure detection sequencer SigmaCR for the crash-recovery failure model is defined, which outputs information about crashes and recoveries and about the state of the crashed or recovered processes. Using the simulation technique of a synchronizer, it is shown that in general it is impossible to implement a synchronizer in an asynchronous distributed system with an arbitrary number of concurrent crash-recovery faults. It is shown that a synchronizer is implementable given SigmaCR and an asynchronous distributed system with at least one correct process. Furthermore, it is proven that SigmaCR can be emulated in a synchronous distributed system and hence can be regarded as the weakest failure detection device suitable to implement a synchronizer in the crash-recovery failure model
Reachability in Parameterized Systems: All Flavors of Threshold Automata
Threshold automata, and the counter systems they define, were introduced as a framework for parameterized model checking of fault-tolerant distributed algorithms. This application domain suggested natural constraints on the automata structure, and a specific form of acceleration, called single-rule acceleration: consecutive occurrences of the same automaton rule are executed as a single transition in the counter system. These accelerated systems have bounded diameter, and can be verified in a complete manner with bounded model checking.
We go beyond the original domain, and investigate extensions of threshold automata: non-linear guards, increments and decrements of shared variables, increments of shared variables within loops, etc., and show that the bounded diameter property holds for several extensions. Finally, we put single-rule acceleration in the scope of flat counter automata: although increments in loops may break the bounded diameter property, the corresponding counter automaton is flattable, and reachability can be verified using more permissive forms of acceleration
Guard Automata for the Verification of Safety and Liveness of Distributed Algorithms (long version)
Distributed algorithms typically run over arbitrary many processes and may involve unboundedly many rounds, making the automated verification of their correctness challenging. Building on domain theory, we introduce a framework that abstracts infinite-state distributed systems that represent distributed algorithms into finite-state guard automata. The soundness of the approach corresponds to the Scott-continuity of the abstraction, which relies on the assumption that the distributed algorithms are layered. Guard automata thus enable the verification of safety and liveness properties of distributed algorithms
Programming at the edge of synchrony
International audienceSynchronization primitives for fault-tolerant distributed systems that ensure an effective and efficient cooperation among processes are an important challenge in the programming languages community. We present a new programming abstraction, ReSync, for implementing benign and Byzantine fault-tolerant protocols. ReSync has a new round structure that offers a simple abstraction for group communication, like it is customary in synchronous systems, but also allows messages to be received one by one, like in the asynchronous systems. This extension allows implementing network and algorithm-specific policies for the message reception, which is not possible in classic round models. The execution of ReSync programs is based on a new generic round switch protocol that generalizes the famous theoretical result of ?. We evaluate experimentally the performance of ReSync's execution platform, by comparing consensus implementations in ReSync with LibPaxos3, etcd, and Bft-SMaRt, three consensus libraries tolerant to benign, resp. byzantine faults
LNCS
Fault-tolerant distributed algorithms play an important role in ensuring the reliability of many software applications. In this paper we consider distributed algorithms whose computations are organized in rounds. To verify the correctness of such algorithms, we reason about (i) properties (such as invariants) of the state, (ii) the transitions controlled by the algorithm, and (iii) the communication graph. We introduce a logic that addresses these points, and contains set comprehensions with cardinality constraints, function symbols to describe the local states of each process, and a limited form of quantifier alternation to express the verification conditions. We show its use in automating the verification of consensus algorithms. In particular, we give a semi-decision procedure for the unsatisfiability problem of the logic and identify a decidable fragment. We successfully applied our framework to verify the correctness of a variety of consensus algorithms tolerant to both benign faults (message loss, process crashes) and value faults (message corruption)
- âŠ